Abstract

Random increasing k-trees represent an interesting, useful class of strongly dependent graphs for which analytic-combinatorial tools can be successfully applied. We study in this paper a notion called connectivity-profile and derive asymptotic estimates for it; some interesting consequences will also be given.

1 Introduction

A k-tree is a graph reducible to a k-clique by successive removals of a vertex of degree k whose neighbors form a k-clique. This class of k-trees has been widely studied in combinatorics (for enumeration and characteristic properties [5, 29]), in graph algorithms (many NP-complete problems on graphs can be solved in polynomial time on k-trees [2]), and in many other fields where k-trees were naturally encountered (see [2]). By construction, vertices in such structures are remarkably close, reflecting a strongly dependent graph structure, and they exhibit, unsurprisingly, the scale-free property [20]. Yet, somewhat unexpectedly, many properties of random k-trees can be dealt with by standard combinatorial, asymptotic and probabilistic tools, thus providing an important model that balances mathematical tractability against predictive power for real-world complex networks.

While the term "k-trees" is not very informative and may indeed be misleading to some extent, they stand out by their underlying tree structure, related to their recursive definition, which facilitates the analysis of the properties and the exploration of the structure. Indeed, for k = 1, k-trees are just trees, and for k ≥ 2 a bijection [11] can be explicitly defined between k-trees and a nontrivial simple family of trees.

The process of generating a k-tree begins with a k-clique, which is itself a k-tree; then the k-tree grows by linking a new vertex to every vertex of an existing k-clique, and to these vertices only. The same process continues; see Figure 1 for an illustration. Such a simple process is reminiscent of several other models proposed in the literature such as k-DAGs [13], random circuits [ ], preferential attachment [4, 7, 21], and many other models (see, for example, [6, 17, 25]). While the construction rule in each of these models is very similar, namely, linking a new vertex to k existing ones, the mechanism of choosing the existing k vertices differs from one case to another, resulting in very different topology and dynamics.




Figure 1: The first few steps of generating a 3-tree and a 4-tree. These graphs illustrate the high connectivity of k-trees.

Restricting to the procedure of choosing a k-clique each time a new vertex is added, there are several variants of k-trees proposed in the literature depending on the modeling needs. So k-trees can be either labeled [5], unlabeled [22], increasing [ ], planar [ ], non-planar [ ], or plane [ ], etc.

For example, the family of random Apollonian networks, corresponding to planar 3-trees, has recently been employed as a model for complex networks [1, 32]. In these frameworks, since the exact topology of the real networks is difficult or even impossible to describe, one is often led to the study of models that present similarities to some observed properties of the real structures, such as the degree of a node and the distance between two nodes.

For the purpose of this paper, we distinguish between two models of random labeled non-plane k-trees; by non-plane we mean that we consider these graphs as given by a set of edges (and not by its graphical representation):

- random simply-generated k-trees, which correspond to a uniform probability distribution on this class of k-trees, and

- random increasing k-trees, where we consider the iterative generation process: at each time step, all existing k-cliques are equally likely to be selected and the new vertex is added with a label which is greater than the existing ones.

The two models are in good analogy to the simply-generated family of trees of Meir and Moon [ ], marked specially by the functional equation $f(z) = z\Phi(f(z))$ for the underlying enumerating generating function, and the increasing family of trees of Bergeron et al. [10], characterized by the differential equation $f'(z) = \Phi(f(z))$. Very different stochastic behaviors have been observed for these families of trees. While similar in structure to these trees, the analytic problems on random k-trees we are dealing with here are however more involved because instead of a scalar equation (either functional, algebraic, or differential), we now have a system of equations.

It is known that random trees in the family of increasing trees are often less skewed, less slanted in shape, a typical description being the logarithmic order for the distance of two randomly chosen nodes; this is in sharp contrast to the square-root order for random trees belonging to the simply-generated family; see for example [10, 14, 19, 23, 24]. Such a contrast has inspired and stimulated much recent research. Indeed, the majority of random trees in the literature of discrete probability, analysis of algorithms, and random combinatorial structures are either $\log n$-trees or $\sqrt{n}$-trees, n being the tree size. While the class of $\sqrt{n}$-trees has been extensively investigated by probabilists and combinatorialists, $\log n$-trees are comparatively less addressed, partly because most of them were encountered not in probability or in combinatorics, but in the analysis of algorithms.

Property                     Simply-generated k-trees                        Increasing k-trees
---------------------------  ----------------------------------------------  -------------------------
Expansion near singularity   $T_s(z) = \tau - h\sqrt{1 - z/\rho} + \cdots$   $T(z) = (1 - kz)^{-1/k}$
Mean distance of nodes       $O(\sqrt{n})$                                   $O(\log n)$
Degree distribution          Power law with exp. tails                       Power law [ ]
Root-degree distribution     Power law with exp. tails                       Stable law (Theorem 7)
Expected profile             Rayleigh limit law                              Gaussian limit law (8)

Table 1: The contrast of some properties between random simply-generated k-trees and random increasing k-trees. Here $\mathcal{Z}$ denotes a node and $\mathcal{Z}^\square$ means a marked node.

Table 1 presents a comparison of the two models: the classes $\mathcal{T}_s$ and $\mathcal{T}$, corresponding respectively to simply-generated k-trees and increasing k-trees. The results concerning simple k-trees are given in [11, 12], and those concerning increasing k-trees are derived in this paper (except for the power law distribution [ ]). We start with the specification, described in terms of operators of the symbolic method [18]. A structure of $\mathcal{T}_s$ is a set of k-tuples of structures of the same type, the roots of each tuple being attached to a new node: $\mathcal{T}_s = \mathrm{Set}(\mathcal{Z} \times \mathcal{T}_s^k)$, while a structure of $\mathcal{T}$ is an increasing structure, in the sense that the new nodes get labels that are smaller than those of the underlying structure (this constraint is reflected by the box-operator): $\mathcal{T} = \mathrm{Set}(\mathcal{Z}^\square \times \mathcal{T}^k)$. The analytic difference immediately appears in the enumerative generating functions that translate the specifications: the simply-generated structures are defined by $T_s(z) = \exp(z T_s^k(z))$ and the corresponding increasing structures satisfy the differential equation $T'(z) = T^{k+1}(z)$. These equations lead to a singular expansion of the square-root type in the simply-generated model, and a singularity of the form $(1 - kz)^{-1/k}$ in the increasing model. Similar analytic differences arise in the bivariate generating functions of shape parameters.

The expected distance between two randomly chosen vertices, or the average path length, is one of the most important shape measures in modeling complex networks as it indicates roughly how efficiently information can be transmitted through the network. Following the same $\sqrt{n}$-vs-$\log n$ pattern, it is of order $\sqrt{n}$ in the simply-generated model, but $\log n$ in the increasing model. Another equally important parameter is the degree distribution of a random vertex: its limiting distribution is a power law with exponential tails in the simply-generated model, of the form $d^{-3/2}\rho_k^d$, in contrast to a power law in the increasing model, of the form $d^{-1-k/(k-1)}$, d denoting the degree [ ]. As regards the degree of the root, its asymptotic distribution remains the same as that of any vertex in the simply-generated model, but in the increasing model, the root-degree distribution is different, with an asymptotic stable law (which is Rayleigh in the case k = 2); see Theorem 7.

Our main concern in this paper is the connectivity-profile. Recall that the profile of a usual tree is the sequence of numbers, each enumerating the total number of nodes with the same distance to the root. For example, the tree (drawn inline in the original) has the profile {1, 2, 2, 1, 3}. Profiles represent one of the richest shape measures and they convey much information regarding particularly the silhouette. On random trees, they have been extensively studied recently; see [ , 15, 16, 19, 21, 23, 27]. Since k-trees have many cycles for k ≥ 2, we call the profile of the transformed tree (see next section) the connectivity-profile as it measures to some extent the connectivity of the graph. Indeed this connectivity-profile corresponds to the profile of the "shortest-path tree" of a k-tree, as defined by Proskurowski [ ], which is nothing more than the result of a Breadth First Search (BFS) on the graph. Moreover, in the domain of complex networks, this kind of BFS tree is an important object; for example, it describes the results of the traceroute measuring tool [30, 31] in the study of the topology of the Internet.

Figure 2: A 2-tree (left) and its corresponding increasing tree representation (right).
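
Since the connectivity-profile is described above as the level sizes of a breadth-first search, a minimal Python sketch of that computation may help fix ideas. The function name `bfs_profile` and the toy graph are illustrative assumptions, not taken from the paper; the sketch counts graph vertices by distance from a single root vertex, which is only a simplified stand-in for the quantity $X_{n;d,j}$ defined later through the tree representation.

```python
from collections import deque


def bfs_profile(adj, root):
    """Return [n_0, n_1, ...], where n_d is the number of vertices at
    graph distance d from `root` (hypothetical helper, for illustration)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:          # first visit gives the shortest distance
                dist[w] = dist[v] + 1
                queue.append(w)
    profile = [0] * (max(dist.values()) + 1)
    for d in dist.values():
        profile[d] += 1
    return profile


# Toy 2-tree: start from the edge {1, 2}, attach 3 to {1, 2}, then 4 to {2, 3}.
toy = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
print(bfs_profile(toy, 1))
```

On the toy graph, vertices 2 and 3 are adjacent to the root and vertex 4 is two steps away.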

We will derive precise asymptotic approximations to the expected connectivity-profile of random increasing k-trees, the major tools used being based on the resolution of a system of differential equations of Cauchy-Euler type (see [ ]). In particular, the expected number of nodes at distance d from the root follows asymptotically a Gaussian distribution, in contrast to the Rayleigh limit distribution in the case of simply-generated k-trees. Also the limit distribution of the number of nodes with distance d to the root will be derived when d is bounded. Note that when d = 1, the number of nodes at distance 1 to the root is nothing but the degree of the root.

This paper is organized as follows. We first present the definition and combinatorial specification of random increasing k-trees in Section 2, together with the enumerative generating functions, on which our analytic tools will be based. We then present two asymptotic approximations to the expected connectivity-profile in Section 3, one for $d = o(\log n)$ and the other for $d \to \infty$ and $d = O(\log n)$. Interesting consequences of our results will also be given. The limit distribution of the connectivity-profile in the range when $d = O(1)$ is then given in Section 4.

2 Random increasing k-trees and generating functions

Since k-trees are graphs full of cycles and cliques, the key step in our analytic-combinatorial approach is to introduce a bijection between k-trees and a suitably defined class of trees (bona fide trees!) for which generating functions can be derived. This approach was successfully applied to the simply-generated family of k-trees in [11], which leads to a system of algebraic equations. The bijection argument used there can be adapted mutatis mutandis here for increasing k-trees, which then yields a system of differential equations through the bijection with a class of increasing trees [10].

Increasing k-trees and the bijection. Recall that a k-clique is a set of k mutually adjacent vertices.

Definition 1 An increasing k-tree is defined recursively as follows. A k-clique in which each vertex gets a distinct label from {1, ..., k} is an increasing k-tree of k vertices. An increasing k-tree with n > k vertices is constructed from an increasing k-tree with n - 1 vertices by adding a vertex labeled n and by connecting it by an edge to each of the k vertices in an existing k-clique.

By random increasing k-trees, we assume that all existing k-cliques are equally likely each time a new vertex is being added. One sees immediately that the number $T_n$ of increasing k-trees of n + k nodes is given by $T_n = \prod_{0 \le i < n}(ik + 1)$, since each new vertex creates k new k-cliques.

Note that if we allow any permutation on all labels, we obtain the class of simply-generated k-trees where monotonicity of labels along paths fails in general.
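
The growth rule of Definition 1 and the count $T_n$ can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the representation of cliques as sorted tuples, and the `rng` parameter are assumptions of this example and not part of the paper.

```python
import random
from math import prod


def count_increasing_k_trees(n, k):
    """T_n = prod_{0 <= i < n} (i*k + 1): after i insertions there are
    i*k + 1 k-cliques to choose from."""
    return prod(i * k + 1 for i in range(n))


def random_increasing_k_tree(n, k, rng=random):
    """Grow a random increasing k-tree with n + k vertices.

    Returns (edges, cliques); every existing k-clique is equally likely
    to be chosen at each step, as in the random model described above."""
    root = tuple(range(1, k + 1))
    cliques = [root]
    edges = {(a, b) for a in root for b in root if a < b}
    for v in range(k + 1, k + n + 1):
        base = rng.choice(cliques)               # uniform over all k-cliques
        edges.update((u, v) for u in base)       # link v to the whole clique
        for i in range(k):                       # k new cliques containing v
            cliques.append(base[:i] + base[i + 1:] + (v,))
    return edges, cliques
```

For instance, `count_increasing_k_trees(3, 2)` evaluates the product 1 · 3 · 5, and a tree grown with n insertions always carries nk + 1 cliques of size k.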

Combinatorially, simply-generated k-trees are in bijection [11] with the family of trees specified by $\mathcal{K}_s = \mathcal{Z}^k \times \mathcal{T}_s$, where $\mathcal{T}_s = \mathrm{Set}(\mathcal{Z} \times \mathcal{T}_s^k)$. Given a rooted k-tree G of n vertices, we can transform G into a tree T, with the root node labeled {1, ..., k}, by the following procedure. First, associate a white node to each k-clique of G and a black node to each (k + 1)-clique of G. Then add a link between each black node and all white nodes associated to the k-cliques it contains. Each black node is labeled with the only vertex not appearing in one of the black nodes above it or in the root. The last step in order to complete the bijection is to order the k vertices of the root and propagate this order to the k sons of each black node. This constructs a tree from a k-tree (see Figure 2); conversely, we can obtain the k-tree through a simple traversal of the tree.

Such a bijection translates directly to increasing k-trees by restricting the class of corresponding trees to those respecting a monotonicity constraint on the labels, namely, on any path from the root to a leaf the labels are in increasing order. This yields the combinatorial specification of the class of increasing trees $\mathcal{T} = \mathrm{Set}(\mathcal{Z}^\square \times \mathcal{T}^k)$. An increasing k-tree is just a tree in $\mathcal{T}$ together with the sequence {1, ..., k} corresponding to the labels of the root-clique¹. A tree in $\mathcal{K}$ is thus completely determined by its $\mathcal{T}$ component, giving $K_{n+k} = T_n$. For example, Figure 2 shows a 2-tree with 19 vertices and its tree representation with 17 black nodes. In the rest of this paper we will thus focus on the class $\mathcal{T}$.

Generating functions. Following the bijection, we see that the complicated dependence structure of k-trees is now completely described by the class of increasing trees specified by $\mathcal{T} = \mathrm{Set}(\mathcal{Z}^\square \times \mathcal{T}^k)$. For example, let $T(z) := \sum_{n \ge 0} T_n z^n/n!$ denote the exponential generating function of the number $T_n$ of increasing k-trees of n + k vertices. Then the specification translates into the equation

\[ T(z) = \exp\left(\int_0^z T^k(x)\,dx\right), \]

or, equivalently, $T'(z) = T^{k+1}(z)$ with $T(0) = 1$, which is solved to be

\[ T(z) = (1 - kz)^{-1/k}; \]

we then check that $T_n = \prod_{0 \le i < n}(ik + 1)$.

If we mark the number of neighbors of the root-node in $\mathcal{T}$ by u, we obtain

\[ T(z,u) = \exp\left(u\int_0^z T(x)\,T^{k-1}(x,u)\,dx\right), \]

where the coefficients $n![u^\ell z^n]T(z,u)$ denote the number of increasing k-trees of size n + k with root degree equal to $k + \ell - 1$. Taking derivative with respect to z on both sides and then solving the equation, we get the closed-form expression (for k ≥ 2)

\[ T(z,u) = \left(1 - u\left(1 - (1 - kz)^{1-1/k}\right)\right)^{-1/(k-1)}. \qquad (1) \]

Since k-trees can be transformed into ordinary increasing trees, the profiles of the transformed trees can be naturally defined, although they do not correspond to simple parameters on k-trees. While the study of profiles may then seem artificial, the results do provide more insight on the structure of random k-trees. Roughly, we expect that all vertices on k-trees are close, one at most of logarithmic order away from the other. The fine results we derive provide in particular an upper bound for that.

Let $X_{n;d,j}$ denote the number of nodes at distance d from j vertices of the root-clique in a random k-tree of n + k vertices. Let $T_{d,j}(z,u) = \sum_{n \ge 0} T_n\, E(u^{X_{n;d,j}})\, z^n/n!$ denote the corresponding bivariate generating function.

¹ We call root-clique the clique composed by the k vertices (1, ..., k). The increasing nature of the k-trees guarantees that these vertices always form a clique. We call root-vertex the vertex with label 1.
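
The relation between the differential equation $T'(z) = T^{k+1}(z)$, $T(0) = 1$, and the product formula for $T_n$ can be checked mechanically on power series. The sketch below is an illustration under our own naming (`egf_coefficients` is hypothetical): it extracts the coefficients $T_n/n!$ from the differential equation with exact rational arithmetic, without using the closed form.

```python
from fractions import Fraction
from math import factorial, prod


def egf_coefficients(k, n_max):
    """Coefficients a_n = T_n / n! of the series solving T' = T^(k+1), T(0) = 1.

    Comparing coefficients gives (n + 1) * a_{n+1} = [z^n] T(z)^(k+1)."""
    a = [Fraction(1)]
    for n in range(n_max):
        # Truncated power T(z)^(k+1) mod z^(n+1), built by repeated products.
        power = [Fraction(1)] + [Fraction(0)] * n
        for _ in range(k + 1):
            power = [sum(power[i] * a[m - i] for i in range(m + 1))
                     for m in range(n + 1)]
        a.append(power[n] / (n + 1))
    return a


def counts_from_ode(k, n_max):
    """T_0, ..., T_{n_max} recovered from the series coefficients."""
    return [a * factorial(n) for n, a in enumerate(egf_coefficients(k, n_max))]
```

Calling `counts_from_ode(2, 4)` should reproduce the products 1, 1, 1·3, 1·3·5, 1·3·5·7, in agreement with $T_n = \prod_{0 \le i < n}(ik+1)$.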
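Theorem 1 The generating functions $T_{d,j}$'s satisfy the differential equations

\[ \frac{\partial}{\partial z} T_{d,j}(z,u) = u^{\delta_{d,1}}\, T_{d,j-1}^{j}(z,u)\, T_{d,j}^{k-j+1}(z,u), \qquad (2) \]

with the initial conditions $T_{d,j}(0,u) = 1$ for $1 \le j \le k$, where $\delta_{a,b}$ denotes the Kronecker function, $T_{0,k}(z,u) = T(z)$ and $T_{d,0}(z,u) = T_{d-1,k}(z,u)$.

Proof. The theorem follows from

\[ T_{d,j}(z,u) = \exp\left(u^{\delta_{d,1}} \int_0^z T_{d,j-1}^{j}(x,u)\, T_{d,j}^{k-j}(x,u)\,dx\right), \]

with $T_{d,j}(z,1) = T(z)$. □

For operational convenience, we normalize all z by z/k and write $\tilde T(z) := T(z/k) = (1 - z)^{-1/k}$. Similarly, we define $\tilde T_{d,j}(z,u) := T_{d,j}(z/k,u)$ and have, by (2),

\[ \frac{\partial}{\partial z} \tilde T_{d,j}(z,u) = \frac{u^{\delta_{d,1}}}{k}\, \tilde T_{d,j-1}^{j}(z,u)\, \tilde T_{d,j}^{k-j+1}(z,u), \qquad (3) \]

with $\tilde T_{d,j}(z,1) = \tilde T(z)$, $\tilde T_{0,k}(z,u) = \tilde T(z)$ and $\tilde T_{d,0}(z,u) = \tilde T_{d-1,k}(z,u)$.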

3 Expected connectivity-profile 

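We consider the expected connectivity-profile $E(X_{n;d,j})$ in this section. Observe first that

\[ E(X_{n;d,j}) = \frac{n!\,k^n\,[z^n] M_{d,j}(z)}{T_n}, \]

where $M_{d,j}(z) := \partial \tilde T_{d,j}(z,u)/\partial u\,|_{u=1}$. It follows from (3) that

\[ M_{d,j}'(z) = \frac{1}{k(1-z)}\left((k - j + 1) M_{d,j}(z) + j M_{d,j-1}(z) + \delta_{d,1}\tilde T(z)\right). \]

This is a standard differential equation of Cauchy-Euler type whose solution is given by (see [9])

\[ M_{d,j}(z) = \frac{(1-z)^{-(k-j+1)/k}}{k} \int_0^z (1-x)^{-(j-1)/k} \left(j M_{d,j-1}(x) + \delta_{d,1}\tilde T(x)\right) dx, \]

since $M_{d,j}(0) = 0$. Then, starting from $M_{0,k} = 0$, we get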
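
As a worked instance of this first step, here is the case d = 1, j = 1, computed by us from the Cauchy-Euler solution formula $M_{d,j}(z) = \frac{(1-z)^{-(k-j+1)/k}}{k}\int_0^z (1-x)^{-(j-1)/k}(jM_{d,j-1}(x) + \delta_{d,1}\tilde T(x))\,dx$ with $\tilde T(z) = (1-z)^{-1/k}$ and $M_{1,0} = M_{0,k} = 0$. It is offered as an illustration under the assumption k ≥ 2 and is not necessarily the display given by the authors.

```latex
\begin{align*}
M_{1,1}(z)
  &= \frac{(1-z)^{-1}}{k} \int_0^z (1-x)^{-1/k}\,dx \\
  &= \frac{(1-z)^{-1}}{k} \cdot \frac{1 - (1-z)^{1-1/k}}{1 - 1/k} \\
  &= \frac{1}{k-1}\left(\frac{1}{1-z} - (1-z)^{-1/k}\right).
\end{align*}
```

The dominant term $\frac{1}{k-1}(1-z)^{-1}$ has $n$-th coefficient $\frac{1}{k-1}$, while $[z^n](1-z)^{-1/k} \sim n^{1/k-1}/\Gamma(1/k)$, so the ratio of coefficients grows like $\frac{\Gamma(1/k)}{k-1}\, n^{1-1/k}$.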

Then by induction, we get 



So we expect, by singularity analysis, that 



\[ E(X_{n;d,j}) \sim \Gamma(1/k)\, \frac{j}{k-1}\, \frac{(\log n)^{d-1}}{(d-1)!}\, n^{1-1/k} \qquad (4) \]

for large n and fixed d, and $1 \le j \le k$. We can indeed prove that the same asymptotic estimate holds in a larger range.



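Theorem 2 The expected connectivity-profile $E(X_{n;d,j})$ satisfies for $1 \le d = o(\log n)$

\[ E(X_{n;d,j}) \sim \Gamma(1/k)\, \frac{j}{k-1}\, \frac{(\log n)^{d-1}}{(d-1)!}\, n^{1-1/k} \qquad (5) \]

uniformly in d, and for $d \to \infty$, $d = O(\log n)$,

\[ E(X_{n;d,j}) \sim \frac{\Gamma(1/k)\, h_{j,1}(\rho)\, \rho^{-d}\, n^{\lambda_1(\rho) - 1/k}}{\Gamma(\lambda_1(\rho)) \sqrt{2\pi\left(\rho\lambda_1'(\rho) + \rho^2\lambda_1''(\rho)\right)\log n}}, \qquad (6) \]

where $\rho = \rho_{n,d} > 0$ solves the equation $\rho\lambda_1'(\rho) = d/\log n$, $\lambda_1(w)$ being the largest zero (in real part) of the equation $\prod_{1 \le \ell \le k}(\theta - \ell/k) - k!\,w/k^k = 0$, which satisfies $\lambda_1(1) = (k+1)/k$.

An explicit expression for the $h_{j,1}$'s is given as follows. Let $\lambda_1(w), \ldots, \lambda_k(w)$ denote the zeros of the equation $\prod_{1 \le \ell \le k}(\theta - \ell/k) - k!\,w/k^k = 0$. Then for $1 \le j \le k$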

hj^i{w) = -, r . (7) 



<s<k k\i{w)-s ) nfc-i+l<s<fc+l(^^l('"^ 

The theorem cannot be proved by the above inductive argument and our method of proof consists of the following steps. First, the bivariate generating functions $\mathcal{M}_j(z,w) := \sum_{d \ge 1} M_{d,j}(z) w^d$ satisfy the linear system

\[ \left((1-z)\frac{d}{dz} - \frac{k-j+1}{k}\right)\mathcal{M}_j = \frac{j}{k}\,\mathcal{M}_{j-1} + \frac{w\tilde T}{k} \qquad (1 \le j \le k), \]

where $\mathcal{M}_0 := w\mathcal{M}_k$, in accordance with $T_{d,0} = T_{d-1,k}$.

Second, this system is solved and has the solutions

\[ \mathcal{M}_j(z,w) = \sum_{1 \le m \le k} h_{j,m}(w)\,(1-z)^{-\lambda_m(w)} - \frac{w - \delta_{k,j}(w-1)}{k}\,\tilde T(z), \]

where the $h_{j,m}$ have the same expression as $h_{j,1}$ but with all $\lambda_1(w)$ in (7) replaced by $\lambda_m(w)$. While the form of the solution is well anticipated, the hard part is the calculation of the coefficient-functions $h_{j,m}$. Third, by singularity analysis and a delicate study of the zeros, we then conclude, by the saddle-point method, the estimates given in the theorem.

Corollary 3 The expected degree of the root $E(X_{n;1,j})$ satisfies

\[ E(X_{n;1,j}) \sim \Gamma(1/k)\, \frac{j}{k-1}\, n^{1-1/k} \qquad (1 \le j \le k). \]

This estimate also follows easily from (1).
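
For j = 1 the corollary can be compared with an exact evaluation. The sketch below relies on an assumption of ours, obtained by solving the Cauchy-Euler equation for d = 1, j = 1 and k ≥ 2: $M_{1,1}(z) = \frac{1}{k-1}\left((1-z)^{-1} - (1-z)^{-1/k}\right)$, so that $E(X_{n;1,1}) = (1/t_n - 1)/(k-1)$ with $t_n = [z^n](1-z)^{-1/k}$. The function names are hypothetical.

```python
from math import gamma


def expected_root_degree(n, k):
    """Exact E(X_{n;1,1}) for k >= 2, assuming the closed form of M_{1,1}
    stated in the lead-in: (1/t_n - 1)/(k - 1), t_n = [z^n](1 - z)^(-1/k)."""
    t = 1.0
    for i in range(1, n + 1):
        t *= (i - 1 + 1.0 / k) / i      # t_i = t_{i-1} * (i - 1 + 1/k) / i
    return (1.0 / t - 1.0) / (k - 1)


def asymptotic_root_degree(n, k):
    """Leading term Gamma(1/k) / (k - 1) * n^(1 - 1/k) (case j = 1)."""
    return gamma(1.0 / k) / (k - 1) * n ** (1.0 - 1.0 / k)
```

As a sanity check, for k = 2 and n = 2 the second vertex added is adjacent to the root-vertex with probability 2/3, giving a mean of 1 + 2/3, and the exact function returns the same value; for large n the ratio of the two functions approaches 1.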

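Let $H_k := \sum_{1 \le \ell \le k} 1/\ell$ denote the harmonic numbers and $H_k^{(2)} := \sum_{1 \le \ell \le k} 1/\ell^2$.

Corollary 4 The expected number of nodes at distance

\[ d = \frac{1}{kH_k}\log n + x\sigma\sqrt{\log n} \]

from the root, where $\sigma = \sqrt{H_k^{(2)}/(kH_k^3)}$, satisfies, uniformly for $x = o((\log n)^{1/6})$,

\[ E(X_{n;d,j}) \sim \frac{n\, e^{-x^2/2}}{\sqrt{2\pi\sigma^2 \log n}}. \qquad (8) \]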
A/ivro"^ log n 
This Gaussian approximation justifies the last item corresponding to increasing trees in Table 1. Note that $\lambda_1(1) = (k+1)/k$ and $\alpha = d/\log n \sim 1/(kH_k)$. In this case, $\rho \sim 1$ and

\[ \rho\lambda_1'(\rho) = \frac{1}{\sum_{1 \le \ell \le k} \frac{1}{\lambda_1(\rho) - \ell/k}}, \]

which implies that $\lambda_1(\rho) - 1/k - \alpha\log\rho \sim 1$.
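
The quantities entering Corollary 4 can be evaluated numerically. The following sketch assumes that $\lambda_1(w)$ is the largest zero of $\prod_{1 \le \ell \le k}(\theta - \ell/k) - k!\,w/k^k$ and, for $w > 0$, locates it by bisection on $\theta > 1$, where the polynomial is increasing. The function names and the tolerance are illustrative choices of ours.

```python
from math import factorial


def char_poly(theta, w, k):
    """prod_{1 <= l <= k} (theta - l/k) - k! * w / k^k."""
    p = 1.0
    for l in range(1, k + 1):
        p *= theta - l / k
    return p - factorial(k) * w / k ** k


def lambda1(w, k, tol=1e-13):
    """Largest real zero of char_poly for w > 0, found by bisection."""
    lo, hi = 1.0, 2.0               # char_poly(1, w, k) = -k! w / k^k < 0
    while char_poly(hi, w, k) < 0:
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if char_poly(mid, w, k) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def rho_lambda1_prime(rho, k):
    """rho * lambda_1'(rho) = 1 / sum_l 1 / (lambda_1(rho) - l/k),
    obtained by logarithmic differentiation of the defining equation."""
    lam = lambda1(rho, k)
    return 1.0 / sum(1.0 / (lam - l / k) for l in range(1, k + 1))
```

At $\rho = 1$ one should recover $\lambda_1(1) = (k+1)/k$ and $\rho\lambda_1'(\rho) = 1/(kH_k)$, which is the centering constant of Corollary 4.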



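Corollary 5 Let $H_n := \max\{d : X_{n;d,j} > 0\}$ denote the height of a random increasing k-tree of n + k vertices. Then

\[ E(H_n) \le \alpha_+ \log n - \frac{\alpha_+}{2\left(\lambda_1(\alpha_+) - \frac{1}{k}\right)} \log\log n + O(1), \]

where $\alpha_+ > 0$ is the solution of the system of equations

\[ \frac{1}{\alpha_+} = \sum_{1 \le \ell \le k} \frac{1}{v - \frac{\ell}{k}}, \qquad v - \frac{1}{k} - \alpha_+ \sum_{1 \le \ell \le k} \log\left(\frac{k}{\ell}\,v - 1\right) = 0. \]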

Table 2 gives the numerical values of $\alpha_+$ for small values of k. For large k, one can show that $\alpha_+ \sim 1/(k\log 2)$ and $\lambda_1(\alpha_+) \sim 2$.

k            2         3         4         5         6
$\alpha_+$   1.085480  0.656285  0.465190  0.358501  0.290847

k            7         8         9         10        20
$\alpha_+$   0.244288  0.210365  0.184587  0.164356  0.077875

Table 2: Approximate numerical values of $\alpha_+$.
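
The entries of Table 2 can be recomputed numerically. The sketch below assumes our reading of the system defining $\alpha_+$, namely $1/\alpha_+ = \sum_{1 \le \ell \le k} 1/(v - \ell/k)$ together with $v - 1/k - \alpha_+\sum_{1 \le \ell \le k}\log(kv/\ell - 1) = 0$; it eliminates $\alpha_+$ and solves for $v > 1$ by bisection. The function name and the bracketing strategy are illustrative assumptions, and the sketch is meant for k ≥ 2.

```python
from math import log


def alpha_plus(k, tol=1e-13):
    """Solve the two-equation system for alpha_+ (k >= 2) by bisection in v."""

    def s(v):
        # 1 / alpha_+ as a function of v (first equation)
        return sum(1.0 / (v - l / k) for l in range(1, k + 1))

    def g(v):
        # left-hand side of the second equation after substituting alpha_+ = 1/s(v)
        logs = sum(log(k * v / l - 1.0) for l in range(1, k + 1))
        return v - 1.0 / k - logs / s(v)

    lo, hi = 1.0 + 1e-9, 2.0        # g is positive just above v = 1
    while g(hi) > 0:
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 1.0 / s((lo + hi) / 2)
```

Under this reading, `alpha_plus(2)` agrees with the first entry of Table 2 to the displayed precision, which supports the reconstruction of the system.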

Corollary 5 justifies that the mean distance in random k-trees is of logarithmic order in the size, as stated in Table 1.



Corollary 6 The width $W_n := \max_d X_{n;d,j}$ is bounded below by

\[ E(W_n) = E\left(\max_d X_{n;d}\right) \ge \max_d E(X_{n;d}) \asymp \frac{n}{\sqrt{\log n}}. \]

We may conclude briefly from all these results that in the transformed increasing trees of random increasing k-trees, almost all nodes are located in the levels with $d = \frac{1}{kH_k}\log n + O(\sqrt{\log n})$, each with $n/\sqrt{\log n}$ nodes.

4 Limiting distributions 

With the availability of the bivariate generating functions (2), we can proceed further and derive the limit distribution of $X_{n;d,j}$ in the range where $d = O(1)$. The case when $d \to \infty$ is much more involved; we content ourselves in this extended abstract with the statement of the result for bounded d.

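Theorem 7 The random variables $X_{n;d,j}$, when normalized by their mean orders, converge in distribution to

\[ \frac{X_{n;d,j}}{n^{1-1/k}(\log n)^{d-1}/(d-1)!} \to \Xi_{d,j}, \qquad (9) \]

where

\[ E\left(e^{\Xi_{d,j}u}\right) = \Gamma\left(\tfrac{1}{k}\right) \sum_{m \ge 0} \frac{c_{d,j,m}}{m!\,\Gamma\left(m(1 - 1/k) + 1/k\right)}\, u^m = \frac{\Gamma\left(\tfrac{1}{k}\right)}{2\pi i} \int_{-\infty}^{(0+)} e^{\tau}\tau^{-1/k}\, C_{d,j}\left(\tau^{-1+1/k}u\right) d\tau, \]

and $C_{d,j}(u) := 1 + \sum_{m \ge 1} c_{d,j,m}u^m/m!$ satisfies the system of differential equations

\[ (k-1)\,u\,C_{d,j}'(u) + C_{d,j}(u) = C_{d,j}(u)^{k+1-j}\, C_{d,j-1}(u)^{j} \qquad (1 \le j \le k), \qquad (10) \]

with $C_{d,0} = C_{d-1,k}$. Here the symbol $\int_{-\infty}^{(0+)}$ denotes any Hankel contour starting from $-\infty$ on the real axis, encircling the origin once counter-clockwise, and returning to $-\infty$.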

We indeed prove the convergence of all moments, which is stronger than weak convergence; also the 
limit law is uniquely determined by its moment sequence. 

So far only in special cases do we have explicit solutions for $C_{d,j}$: $C_{1,1}(u) = (1 + u)^{-1/(k-1)}$ and



l+u ' 

1 ^ _ 3 



Note that the result (9) when d = 1 can also be derived directly from the explicit expression (1). In particular, when k = 2, the limit law is Rayleigh.